- 
                Notifications
    You must be signed in to change notification settings 
- Fork 717
support 64-bit file offsets on systems with 32-bit off_t #2032
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
| The unused_mut complaints in src/sys/socket/mod.rs exist before this PR. I've sent PR #2033 to fix those. | 
| #2033 was merged. I've rebased this PR on top of that. I expect it will pass the checks now. I've made a bunch of commits addressing check failures - would you like to squash those before review? | 
| Does anyone have thoughts on this? @asomers ? | 
| Any comments? | 
| This PR will break backwards compatibility for callers who pass a  fn pread<Off: Into<libc::off_t>>(f: RawFd, buf: &mut [u8], offset: Off> {...} | 
| 
 Assuming you meant "Into<off_t>" (that is, accept any type (including libc::off_t) that can be converted into a nix::off_t). I'll see how that goes. | 
4d35ffa    to
    10d3fea      
    Compare
  
    | Ok, uclibc is interesting. It can be compiled so that file offsets are always 64-bit (__USE_FILE_OFFSET64) and it can be compiled to provide the *64() functions and off64_t (__USE_LARGEFILE64). However, it does not apply these options to preadv() and pwritev(). Instead, preadv() and pwritev() always take 64-bit offsets (regardless of __USE_FILE_OFFSET64), and preadv64() and pwritev64() are never provided (regardless of __USE_LARGEFILE64). My original plan was to focus on making nix work with large files when using glibc and leave everything else untouched as much as possible. After having tried a couple of variations of the code as it relates to uclibc, it's still in that spirit. I have changed the comments to make it clear that "uclibc doesn't use off_t" is specific to preadv and pwritev, but other than that, the code does the same thing it did before when using uclibc. This means uclibc on 32-bit Linux is still using 32-bit file offsets - something we may want to address in a follow-up. | 
| I've made most of the relevant functions use <Off: Into<off_t>> so that they can accept either a nix::off_t or a libc::off_t. This should increase compatibility with existing code. There are, however, still a few cases where a change to calling code is required: 
 I also ran into a situation with mmap. I can add Off: Into<off_t> easily enough, and that will work fine for call sites that don't pass any type parameters. But there are a number of cases in the tests where it's being called as mmap(...). This is because None is being passed as the optional f, and we need to tell the compiler what type the None is. If I change the number of type parameters, such call sites will have to be updated. I'm not really happy with how that looks, so I've left mmap alone for now. I suspect that, in practice, the offset being passed to mmap is usually a literal 0, but, of course, I don't really know all the code that's out there. | 
| 
 Actually, I meant  | 
03b0935    to
    28ff751      
    Compare
  
    | Status update, since I've been called away to other duties for a while: u64 is going to be a bit of a challenge, because off_t is signed. This means, among other things, that code that passes in a libc::off_t will fail to compile if we switch to u64. It does have the nice property that it's also what std uses (std::fs::Metadata::len is u64, and std::io::Seek::stream_position returns Result, to name a few examples). Alternatively, we could do i64. I made some changes to see what that would look like, but was called away to other duties. I've uploaded what I have, which doesn't yet pass tests on all targets, but should give an idea. Mostly, it simplifies things compared to having some nix-specific offset type, which I like. The biggest difference between this and the previous version (which used nix::off_t) is that i64 is always 64-bit, whereas nix::off_t as I had coded it was 64-bit on most targets, but could be a different size on some - for example, on 32-bit Linux where the C library is not glibc. This is what many of the check failures are about: we're calling a libc function which expects a 32-bit off_t, but we're giving it an i64. This can be fixed; I just haven't gotten around to it. In the repo where I'm doing the development, I have a function to_off_t which converts the offset passed into a function into the type that the libc function requires and returns Err(Errno::EOVERFLOW) if this does not succeed. Here is an example of how it is used. This has the same effect as lseek returning EOVERFLOW, which it is documented to do if the file position cannot be represented in an off_t. Advantages of this approach: 
 Disadvantages: 
 | 
| Yes, I wasn't thinking about signedness.   
 | 
| 
 The BSDs have 64-bit off_t even on 32-bit architectures. So does Haiku if I read the source correctly. The only platform I know of that has 32-bit off_t is 32-bit Linux. I think you can compile a libc for Linux that has 32-bit off_t and doesn't provide the *64() functions. Do we want to support such a configuration? I think this already currently doesn't work, because, for example, we provide lseek64 on Linux regardless of word size. | 
| I've discovered this issue compiling for arm-unknown-linux-gnueabi. In particular, using the fcntl API assumes the 32bit version of flock inside FcntlArg::F_SETLK and friends, and I repeated this same problem in #2300 (noting that this doesn't block merging that, the fix will be the same for all APIs). | 
3966cf7    to
    eeaed43      
    Compare
  
    979b487    to
    dc82452      
    Compare
  
    | It seems that 32-bit Android, too, uses 32-bit file offsets. Let's see if I can make 64-bit offsets work there, too. | 
| Per https://android.googlesource.com/platform/bionic/+/main/docs/32-bit-abi.md, Android has the *64 functions available, so I think the same change for glibc will work for Android, too. | 
3db46c2    to
    117eb6a      
    Compare
  
    Adds large file support (64-bit file positions) to the existing Nix
API when targetting systems that have separate 32-bit and 64-bit
APIs.
On many platforms that support 64-bit file positions, this support is
present in the standard I/O functions. On others, there have
historically been two APIs: the standard API, which limits file size to
2**31-1 bytes, and a separate API that suffixes function names with "64"
and supports 64-bit file sizes. This "transitional interface for 64-bit
file offsets" is not present on every platform. As a result, using the
*64 API will make programs non-portable, whereas using the standard API
will result in programs that cannot handle large file offsets on some
systems, even if those systems actually do have a way to handle
large file offsets.
This change enables 64-bit file offsets on the following platforms
where the standard API only provides 32-bit offsets:
 - 32-bit Linux with glibc.
 - 32-bit Android.
To support large files with a consistent API across platforms, this
change makes Nix functions call the 64-bit capable equivalents where
necessary. Other C libraries may not provide the *64 API, so we continue
to call the standard functions on those.
Broadly, the change consists of 5 parts:
 1. Uses of libc::off_t have are replaced by i64. This provides a
    consistent API across platforms and allows 64-bit offsets to
    be used even when libc::off_t is 32-bit. This is similar to
    std::fs, which uses u64 for file positions. Nix uses i64
    because off_t is a signed type. Into<i64> is used where appropriate
    so that existing code that uses libc::off_t will continue to
    work without changes.
 2. A largefile_fn macro that is used to select the large-file-capable
    version of a function. E.g. largefile_fn![pwrite] is equivalent
    to libc::pwrite64 on glibc/Linux and plain libc::pwrite on
    systems that don't have or need libc::pwrite64.
 3. A new require_largefile macro that is used to skip tests that
    require libc::off_t to be larger than 32 bits.
 4. Changes to fallocate, ftruncate, lseek, mmap, open, openat,
    posix_fadvise, posix_fallocate, pread, preadv, pwrite, pwritev,
    sendfile, and truncate, making them support large files.
 5. A set of test_*_largefile tests to verify that each of the
    previously mentioned functions now works with files whose size
    requires more than 32 bits to represent.
A few functions are still limited to 32-bit file sizes after this
change. This includes the aio* functions, the *stat* functions, and
mkstemp(). Adding large file support to those requires a bit more
work than simply calling a different function and is therefore left
for later.
    | @asomers, are you happy to take another look at this? As we discussed, I've changed to using i64 for offsets. I've also extended the change to cover Android, which on 32-bit also has 32-bit off_t and the *64 largefile API. With the most recent run, all checks pass. | 
| Thanks for the detailed write-up, I haven't taken a closer look at the patch code, but the idea generally looks good to me. From the libc crate source code, those  
 We don't need to support that combination, Nix, as a consumer of the libc crate, should follow and trust it to some extent. Since  I hope I can review this PR this weekend. | 
| @SteveLauC, thank you for taking a look! Did you get a chance this past weekend? | 
| 
 No, I was quite down during the last 3 weeks, which makes me can barely handle open-source stuff, will review this today:) | 
| The libc crate wants to deal with this problem in version 1.0, which would bring huge breaking changes to its code base, Nix is a big consumer of the libc crate, so I am kinda worried that if we handle this before the libc crate takes further action, it would be a big refactor for us as well when we switch to libc 1.0. Though I would say libc 1.0 would land in the near future, LFS is hard since it differs from the platforms and varies with different C pre-processor configs even in the same platform. Ref: 
 | 
| @SteveLauC, thank you for taking a look and adding the pointers to the libc crate's proposed future direction. I do appreciate not wanting to do a big refactor in the future. On the other hand, I don't expect that this will turn out to be particularly painful, for a couple of reasons: 
 The biggest risk I see is that nix's API will look a bit different from libc's. For example, you could imagine the libc crate deciding on the approach that calling a libc crate function should always call the function of the same name in the C library (i.e. pwrite would call pwrite, not pwrite64). This is different from what this PR does, which is always to call the 64-bit variant (whether that has 64 in the name or not). Arguably, if libc and nix take different paths here, that might cause some confusion. Having said the above, I think it does make sense for nix to always use the 64-bit functions, whether or not the libc crate ends up taking the same approach. After all, nix aims to provide a friendly wrapper and to unify what can be unified. I would say having pwrite always mean the 64-bit pwrite is consistent with that. It makes code written against nix more portable, because the offsets are always 64 bits. Moreover, it avoids the footgun of having nicely named, available on all platforms functions that will suddenly fail if your files grow beyond a certain size, and having to remember to add "64" to the names where available. Given that nix aims to be a friendly wrapper around libc, I would argue that the right thing for it to do is to use the 64-bit libc functions where available. In summary, I think standardizing on 64-bit file offsets is the right thing to do for nix, and I think we don't have to wait for the libc crate to make that move. | 
Adds large file support (64-bit file positions) to the existing Nix
API when targetting systems that have separate 32-bit and 64-bit
APIs.
On many platforms that support 64-bit file positions, this support is
present in the standard I/O functions. On others, there have
historically been two APIs: the standard API, which limits file size to
2**31-1 bytes, and a separate API that suffixes function names with "64"
and supports 64-bit file sizes. This "transitional interface for 64-bit
file offsets" is not present on every platform. As a result, using the
*64 API will make programs non-portable, whereas using the standard API
will result in programs that cannot handle large file offsets on some
systems, even if those systems actually do have a way to handle
large file offsets.
This change enables 64-bit file offsets on the following platforms
where the standard API only provides 32-bit offsets:
To support large files with a consistent API across platforms, this
change makes Nix functions call the 64-bit capable equivalents where
necessary. Other C libraries may not provide the *64 API, so we continue
to call the standard functions on those.
Broadly, the change consists of 5 parts:
Uses of libc::off_t have are replaced by i64. This provides a
consistent API across platforms and allows 64-bit offsets to
be used even when libc::off_t is 32-bit. This is similar to
std::fs, which uses u64 for file positions. Nix uses i64
because off_t is a signed type. Into is used where appropriate
so that existing code that uses libc::off_t will continue to
work without changes.
A largefile_fn macro that is used to select the large-file-capable
version of a function. E.g. largefile_fn![pwrite] is equivalent
to libc::pwrite64 on glibc/Linux and plain libc::pwrite on
systems that don't have or need libc::pwrite64.
A new require_largefile macro that is used to skip tests that
require libc::off_t to be larger than 32 bits.
Changes to fallocate, ftruncate, lseek, mmap, open, openat,
posix_fadvise, posix_fallocate, pread, preadv, pwrite, pwritev,
sendfile, and truncate, making them support large files.
A set of test_*_largefile tests to verify that each of the
previously mentioned functions now works with files whose size
requires more than 32 bits to represent.
A few functions are still limited to 32-bit file sizes after this
change. This includes the aio* functions, the stat functions, and
mkstemp(). Adding large file support to those requires a bit more
work than simply calling a different function and is therefore left
for later.